METHOD AND APPARATUS FOR EXTRACTING FEATURE VECTOR USED FOR
FACE RECOGNITION AND RETRIEVAL



BACKGROUND OF THE INVENTION

This application claims the priority of Korean Patent Application Nos. 
2002-62650 and 2003-26426, filed on October 15, 2002 and April 25, 2003, 
respectively in the Korean Intellectual Property Office, the disclosures of which are 
incorporated herein in their entirety by reference.

1. Field of the Invention

The present invention relates to a face recognition and retrieval system, and 
more particularly, to a method and apparatus for extracting a feature vector used for 
face searching and recognition, which can overcome the limits of feature vectors
generated in a frequency domain and a spatial domain by separately generating a 
Fourier feature vector and an intensity feature vector for a normalized face image 
and then merging the Fourier feature vector and the intensity feature vector. 

2. Description of the Related Art 

In the "information society" we now live in, personal information and the
information of specific groups are considered worth more than other property. In
order to protect such valuable information from third persons, it is very important
to develop a variety of techniques capable of efficiently recognizing the identity of
people who attempt to access the information. Among currently available
identification techniques, face recognition has been considered the most convenient
and most efficient identification method because it can be used to recognize a
person's identity without letting the person realize that his/her identity is being
checked and does not require a person to move or act in a specific way during an
identification process.

Even though social demands for end-user products, such as credit cards,
debit cards, or electronic IDs, which inevitably require an identification process, have
become stronger, only a few identification techniques, including password-based
identification techniques, have been proven effective. The lack of alternatives to
password-based identification techniques has given rise to many social problems,
such as identity-theft crimes using computers. Since face recognition is believed to
be able to solve these problems, it has been drawing tremendous attention from the
public. In addition, face recognition has a great possibility of being applied to many
different fields, such as terminal access control, public place control, electronic
photo albums, and criminal face recognition.

In the meantime, there are many different types of face recognition techniques. 
One of those techniques is a principal component analysis (PCA)-based face 
recognition technique. PCA is a method of compressing data by projecting image 
data onto a low-dimension vector space while minimizing data loss. The 
PCA-based face recognition technique is capable of recognizing a face image by 
extracting major feature vectors from the face image and classifying the face image 
into a predetermined category based on the extracted major feature vectors. 
However, this technique has problems, such as low recognition speed and low 
reliability. Even though this technique shows some degree of reliability irrespective
of variations in the lightness of a face image, it has failed to provide reliable and 
satisfactory face recognition results for different face expressions and poses. 

In order to overcome the limits of the PCA-based face recognition technique, 
a method of extracting feature vectors from a face image in a frequency domain by 
using principal component linear discriminant analysis (PCLDA) has been suggested 
by Toshio Kamei et al. in "Report of the core experiments on Fourier spectral PCLDA 
based face descriptor," ISO/IEC JTC1/SC29/WG11, M8559, Klagenfurt, Austria, Oct.
2002, and a method of extracting component-wise feature vectors in a spatial domain
by using LDA has been suggested by Tae-kyun Kim in "Component-based LDA Face
Descriptor for Image Retrieval," British Machine Vision Conference 2002, September
2002. These two methods, however, have distinct limitations in terms of precision of
face recognition and retrieval, partly because they solely take advantage of either 
frequency-domain feature vectors or spatial-domain feature vectors. 

SUMMARY OF THE INVENTION 
The present invention provides a method of extracting feature vectors for face 
recognition and retrieval, which can overcome the limits of feature vectors generated 
in a frequency domain and a spatial domain by separately generating a Fourier 
feature vector and an intensity feature vector for a normalized face image and then 
merging the Fourier feature vector and the intensity feature vector. 



The present invention also provides an apparatus for extracting a feature 
vector used for face recognition and retrieval, which uses the method of extracting 
feature vectors for face recognition and retrieval. 

According to an aspect of the present invention, there is provided a method of
extracting feature vectors for face recognition and retrieval. An entire Fourier
feature vector is generated for an entire face area of a normalized face image using
first and second normalized vectors, and a central Fourier feature vector is
generated for a central face area of the normalized face image using third and fourth
normalized vectors. An entire intensity feature vector is generated for the entire
face area, and a local intensity feature vector is generated for a predetermined
number of face component-wise areas. An entire composite feature vector is
generated by coupling the first and second normalized vectors and the entire
intensity feature vector, and a central composite feature vector is generated by
coupling the third and fourth normalized vectors and the local intensity feature vectors.

According to another aspect of the present invention, there is provided an
apparatus for extracting feature vectors for face recognition and retrieval, including a
first unit and a second unit. The first unit generates an entire Fourier feature vector
for an entire face area of a normalized face image using first and second normalized
vectors, and generates a central Fourier feature vector for a central face area of the
normalized face image using third and fourth normalized vectors. The second unit
generates an entire intensity feature vector for the entire face image and a local
intensity feature vector for a predetermined number of face component-wise areas,
generates an entire composite feature vector by coupling the first and second
normalized vectors and the entire intensity feature vector, and generates a central
composite feature vector by coupling the third and fourth normalized vectors and the
local intensity feature vector.

According to still another aspect of the present invention, there is provided a 
computer-readable recording medium on which a program enabling the method of 
extracting feature vectors for face recognition and retrieval is recorded. 

BRIEF DESCRIPTION OF THE DRAWINGS 
The above and other features and advantages of the present invention will
become more apparent by describing in detail exemplary embodiments thereof with
reference to the attached drawings in which:

FIG. 1 is a block diagram of an apparatus for extracting a feature vector used
for face recognition and retrieval according to a preferred embodiment of the present
invention;

FIG. 2 is a detailed block diagram of a Fourier feature generator shown in
FIG. 1;

FIGS. 3A and 3B are detailed block diagrams illustrating a block-divided facial
area shown in FIG. 2;

FIG. 4 is a detailed block diagram of an entire Fourier feature vector generator
shown in FIG. 1;

FIG. 5 is a detailed block diagram of a central Fourier feature vector generator
shown in FIG. 1;

FIG. 6 is a detailed block diagram of an intensity feature generator shown in
FIG. 1;

FIG. 7 is a detailed block diagram of a pose estimation/compensation unit
shown in FIG. 6;

FIG. 8 is a detailed block diagram of an entire composite feature vector
generator shown in FIG. 1;

FIG. 9 is a detailed block diagram of a central composite feature vector
generator shown in FIG. 1; and

FIG. 10 is a diagram illustrating raster scanning, adopted by the Fourier
feature generator shown in FIG. 1.

DETAILED DESCRIPTION OF THE INVENTION 
The present invention will now be described more fully with reference to the
accompanying drawings in which preferred embodiments of the invention are shown.

FIG. 1 is a block diagram of an apparatus for extracting a feature vector used
for face recognition and retrieval according to a preferred embodiment of the present
invention. Referring to FIG. 1, the apparatus is comprised of a frequency feature
vector generation unit 120 and a composite feature vector generation unit 130. The
frequency feature vector generation unit 120 includes a Fourier feature generation
unit 121, an entire Fourier feature vector generation unit 123, and a central Fourier
feature vector generation unit 125, and the composite feature vector generation unit
130 includes a Fourier feature generation unit 121, an intensity feature generation
unit 131, an entire composite feature vector generation unit 133, and a central
composite feature vector generation unit 135.

As shown in FIG. 1, a normalized face image 110 is obtained by scaling an
original face image into, for example, 56 lines, in which case each line is comprised
of, for example, 46 pixels. The center of the right eye in the original face image is
located in the column and 16th row of the normalized face image 110, and the
center of the left eye is located in the 24th column and 31st row of the normalized face
image 110.

In the frequency feature vector generation unit 120, the Fourier feature 
generation unit 121 performs Fourier transformation on the normalized face image 
110, i.e., an entire face area and a central face area, thus obtaining the Fourier 
spectra and Fourier amplitudes of the entire face area and the central face area, 
defines first and second feature vectors using the Fourier spectrum and Fourier
amplitude of the entire face area, and defines third and fourth feature vectors using 
the Fourier spectrum and Fourier amplitude for the central face area. The first 
through fourth feature vectors are projected onto a principal component linear 
discriminant analysis (PCLDA) sub-space and then are normalized into unit vectors. 
The first and second normalized vectors are feature vectors for the entire face area, 
and the third and fourth normalized vectors are feature vectors for the central face 
area. Here, Fourier feature vectors are obtained by encoding spatial relations 
between pixels constituting the normalized face image 110. 

The entire Fourier feature vector generation unit 123 combines the first and 
second normalized vectors, provided by the Fourier feature generation unit 121, into 
a single combined vector, projects the combined vector onto a discriminant space
defined by a predetermined basis matrix, quantizes each component of the projected
vector in a predetermined manner, and stores the quantized vector as an entire 
Fourier feature vector. 

The central Fourier feature vector generation unit 125 combines the third and 
fourth normalized vectors, provided by the Fourier feature generation unit 121, into a
single combined vector, projects the combined vector onto a discriminant space 
defined by a predetermined basis matrix, quantizes each component of the projected 
vector in a predetermined manner, and stores the quantized vector as a central 
Fourier feature vector. 



In the composite feature vector generation unit 130, the intensity feature
generation unit 131 carries out pose estimation and compensation on the normalized
face image 110, projects an entire face area and a predetermined number of
components of the normalized face image 110, for example, 5 components of the
normalized face image 110, among portions of the pose-compensated face image
into the PCLDA sub-space, and normalizes the projection results into fifth and sixth
unit vectors. An entire intensity feature vector for the entire face area is a fifth
feature vector, which has been normalized into the fifth unit vector, and local
intensity feature vectors for portions of the normalized face image 110 are sixth
through tenth feature vectors, each of which has been normalized into the sixth unit
vector. Here, the intensity feature vectors are obtained by encoding variations in the
intensity of the pixels constituting the entire face area and the five components of the
normalized face image 110.

The entire composite feature vector generation unit 133 combines the first
and second normalized vectors, provided by the Fourier feature generation unit 121,
and the entire intensity feature vector, provided by the intensity feature generation
unit 131, into a single combined vector, projects the combined vector onto a
discriminant space defined by a predetermined basis matrix, quantizes each
component of the projected vector in a predetermined manner, and stores the
quantized vector as an entire composite feature vector.

The central composite feature vector generation unit 135 combines the third
and fourth normalized vectors, provided by the Fourier feature generation unit 121,
and the local intensity feature vectors, provided by the intensity feature generation
unit 131, into a single combined vector, projects the combined vector onto a
discriminant space defined by a predetermined basis matrix, quantizes each
component of the projected vector in a predetermined manner, and stores the
quantized vector as a central composite feature vector.

FIG. 2 is a detailed block diagram of the Fourier feature generation unit 121 of
FIG. 1. Referring to FIG. 2, the Fourier feature generation unit 121 includes a first
image division unit 210, an entire Fourier feature vector generator 220, and a central
Fourier feature vector generator 230. The entire Fourier feature vector generator
220 includes first and second Fourier transformers 222 and 225, first and second
PCLDA projectors 223 and 226, and a first vector normalizer 227. The central
Fourier feature vector generator 230 includes third and fourth Fourier transformers
232 and 235, third and fourth PCLDA projectors 233 and 236, and a second vector
normalizer 237.

As shown in FIG. 2, the first image division unit 210 divides the normalized
face image 110 into an entire face area 221, a block-divided face area 224, and a
central face area 231.

The entire Fourier feature vector generator 220 obtains a Fourier spectrum of 
the entire face area 221 and Fourier amplitude of the block-divided face area 224 
and then generates an entire Fourier feature vector using the Fourier spectrum and 
Fourier amplitude, a process which will become more apparent in the following 
paragraphs. 

The first Fourier transformer 222 performs Fourier transformation on the entire
face area 221, thus converting the entire face area 221 into a frequency-domain
component. Supposing that the entire face area 221 is represented by $f(x, y)$, the
Fourier spectrum $F(u, v)$ can be expressed by the following equation.

$$F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y) \exp\left(-2\pi i \left(\frac{xu}{M} + \frac{yv}{N}\right)\right) \qquad (1)$$

In Equation (1), $M = 46$, $N = 56$, $u = 0, 1, 2, \ldots, 45$, and $v = 0, 1, 2, \ldots, 55$.
$F(0, 0)$ indicates a DC component. The Fourier spectrum $F(u, v)$ is used for
obtaining a first feature vector $x_1^f$ whose elements are defined by a real number
portion $\mathrm{Re}[F(u, v)]$ and an imaginary number portion $\mathrm{Im}[F(u, v)]$ obtained by
raster-scanning the Fourier spectrum $F(u, v)$. More specifically, the raster scanning
is carried out on all components of scanning areas A and B except for high frequency
components ($u = 12, 13, \ldots, 34$), as defined in Table 1 below. Table 1 shows raster
scanning parameters used for extracting feature vectors from a Fourier domain.



Table 1

Feature   Feature           Scan Area A         Scan Area B          Vector Dimension
Vector                      S_A      E_A        S_B       E_B        Subtotal   Number      Total
                                                                                of Blocks
x_1^f     Re[F(u, v)]       (0, 0)   (11, 13)   (35, 0)   (45, 13)   322        -           644
          Im[F(u, v)]       (0, 0)   (11, 13)   (35, 0)   (45, 13)   322        -
x_2^f     |F_1^0(u, v)|     (0, 0)   (10, 13)   (33, 0)   (43, 13)   308        1           856
          |F_k^1(u, v)|     (0, 0)   (5, 6)     (17, 0)   (21, 6)    77         4
          |F_k^2(u, v)|     (0, 0)   (2, 2)     (9, 0)    (10, 2)    15         16
x_1^g     Re[G(u, v)]       (0, 0)   (7, 7)     (24, 0)   (31, 7)    128        -           256
          Im[G(u, v)]       (0, 0)   (7, 7)     (24, 0)   (31, 7)    128        -
x_2^g     |G_1^0(u, v)|     (0, 0)   (7, 7)     (24, 0)   (31, 7)    128        1           384
          |G_k^1(u, v)|     (0, 0)   (3, 3)     (12, 0)   (15, 3)    32         4
          |G_k^2(u, v)|     (0, 0)   (1, 1)     (6, 0)    (7, 1)     8          16



In Table 1, $S_A$ and $S_B$ represent starting points of the scanning areas A and B,
respectively, and $E_A$ and $E_B$ represent ending points of the scanning areas A and B,
respectively. Examples of the scanning areas A and B and the raster scanning are
illustrated in FIG. 10. As shown in FIG. 10, Fourier components are extracted from
the scanning areas A and B through raster scanning performed along a $u$ direction.

The first feature vector $x_1^f$, which is obtained as a result of the raster
scanning, can be expressed by Equation (2) below. Here, the first feature vector $x_1^f$
has a dimension of 644.

$$x_1^f = \begin{bmatrix} \mathrm{Re}[F(0, 0)] \\ \mathrm{Re}[F(1, 0)] \\ \vdots \\ \mathrm{Re}[F(M-1, N/4-1)] \\ \mathrm{Im}[F(0, 0)] \\ \mathrm{Im}[F(1, 0)] \\ \vdots \\ \mathrm{Im}[F(M-1, N/4-1)] \end{bmatrix} \qquad (2)$$

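As a rough illustration of Equations (1) and (2), the following NumPy sketch computes the Fourier spectrum of a 46x56 image and raster-scans scanning areas A and B of Table 1. The function names are hypothetical, and the scan order (u varying fastest, with area A followed by area B within each row v) is an assumption, since FIG. 10 is not reproduced in this text.

```python
import numpy as np

def scan_columns(a_range, b_range):
    """Return the u indices of scan areas A and B (end points inclusive)."""
    return np.r_[a_range[0]:a_range[1] + 1, b_range[0]:b_range[1] + 1]

def first_feature_vector(face):
    """Build the 644-dimensional first feature vector from a 56x46 array.

    face[y, x] holds f(x, y). np.fft.fft2 uses the exp(-2*pi*i*...) kernel,
    so spectrum[v, u] corresponds to F(u, v).
    """
    if face.shape != (56, 46):
        raise ValueError("expected a 56-row by 46-column normalized face image")
    spectrum = np.fft.fft2(face)
    u_idx = scan_columns((0, 11), (35, 45))   # Table 1, scan areas A and B
    kept = spectrum[0:14, :][:, u_idx]        # rows v = 0..13
    flat = kept.ravel()                       # assumed order: u varies fastest
    return np.concatenate([flat.real, flat.imag])
```

The high-frequency columns u = 12, ..., 34 are simply never indexed, which is how the sketch realizes the exclusion described in the text.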
The first PCLDA projector 223 projects the first feature vector $x_1^f$, obtained by
the first Fourier transformer 222, onto a discriminant space obtained by carrying out
PCLDA on the first feature vector $x_1^f$. The discriminant space is defined by a first
basis matrix $\Psi_1^f$, which is obtained using a predetermined well-known algorithm and
is also well known to those skilled in the art.

The second Fourier transformer 225 converts the block-divided face area 224
into a frequency domain by performing Fourier transformation on the block-divided
face area 224. The block-divided face area 224, as shown in FIG. 3A, includes
three sub-spaces, i.e., an entire area 311, a four-block area 312, and a sixteen-block
area 313. The Fourier transformation is performed on each of the sub-spaces of the
block-divided face area 224. Here, the entire area 311, represented by $f_1^0(x, y)$, is
obtained by removing boundary rows of the normalized face image 110 and clipping
a resulting face image into a 44x56 image. $f_1^0(x, y)$ can be represented by
Equation (3) below.

$$f_1^0(x, y) = f(x + 1, y) \qquad (3)$$

In Equation (3), $x = 0, 1, 2, \ldots, 43$, and $y = 0, 1, 2, \ldots, 55$.

The four-block area 312, which is represented by $f_k^1(x, y)$, and the
sixteen-block area 313, which is represented by $f_k^2(x, y)$, are obtained using the
entire area 311. Specifically, the four-block area 312 is obtained by dividing the
entire area 311 into four identical blocks having a 22x28 size. $f_k^1(x, y)$ can be
represented by Equation (4) below.

$$f_k^1(x, y) = f_1^0(x + 22 s_k^1, y + 28 t_k^1) \qquad (4)$$

In Equation (4), $k = 1, 2, 3, 4$, $x = 0, 1, 2, \ldots, 21$, $y = 0, 1, 2, \ldots, 27$,
$s_k^1 = (k - 1) \bmod 2$, and $t_k^1 = \mathrm{round}\left(\frac{k-1}{2}\right)$.

The sixteen-block area 313, which is represented by $f_k^2(x, y)$, is obtained by
dividing the entire area 311 into 16 identical blocks having an 11x14 size. $f_k^2(x, y)$
can be expressed by Equation (5) below.

$$f_k^2(x, y) = f_1^0(x + 11 s_k^2, y + 14 t_k^2) \qquad (5)$$

In Equation (5), $k = 1, 2, 3, \ldots, 16$, $x = 0, 1, 2, \ldots, 10$, $y = 0, 1, 2, \ldots, 13$,
$s_k^2 = (k - 1) \bmod 4$, and $t_k^2 = \mathrm{round}\left(\frac{k-1}{4}\right)$.

A Fourier spectrum $F_k^j(u, v)$ and Fourier amplitude $|F_k^j(u, v)|$ are obtained as
results of the Fourier transformation performed on the entire area 311, the four-block
area 312, and the sixteen-block area 313 by the second Fourier transformer 225 and
can be expressed by Equations (6) and (7), respectively.

$$F_k^j(u, v) = \sum_{x=0}^{M^j-1} \sum_{y=0}^{N^j-1} f_k^j(x, y) \exp\left(-2\pi i \left(\frac{xu}{M^j} + \frac{yv}{N^j}\right)\right) \qquad (6)$$

$$|F_k^j(u, v)| = \sqrt{\mathrm{Re}[F_k^j(u, v)]^2 + \mathrm{Im}[F_k^j(u, v)]^2} \qquad (7)$$

In Equation (7), $\mathrm{Re}(z)$ and $\mathrm{Im}(z)$ represent the real number portion and
imaginary number portion, respectively, of a complex number $z$. $M^j$ represents the
width of the entire area 311 or the width of each sub-block of the four-block area 312
or the sixteen-block area 313. For example, $M^0 = 44$, $M^1 = 22$, and $M^2 = 11$. $N^j$
represents the height of the entire area 311 or each sub-block of the four-block area
312 or the sixteen-block area 313. For example, $N^0 = 56$, $N^1 = 28$, and $N^2 = 14$.

A second feature vector $x_2^f$ is obtained by performing raster scanning on the
Fourier amplitude $|F_k^j(u, v)|$ except for high frequency components, as defined in
Table 1. The raster scanning is carried out on Fourier amplitude values in the order of
the Fourier amplitude of the entire area 311, $|F_1^0(u, v)|$, the Fourier amplitudes of the
four-block area 312, $|F_1^1(u, v)|$, $|F_2^1(u, v)|$, $|F_3^1(u, v)|$, and $|F_4^1(u, v)|$, and the Fourier
amplitudes of the sixteen-block area 313, $|F_1^2(u, v)|$, $|F_2^2(u, v)|$, ..., $|F_{16}^2(u, v)|$.
The second feature vector $x_2^f$, which is obtained as a result of the raster
scanning, can be expressed by Equation (8) below. Here, the dimension of the
second feature vector $x_2^f$ is 856.

$$x_2^f = \begin{bmatrix} |F_1^0(0, 0)| \\ |F_1^0(1, 0)| \\ \vdots \\ |F_1^0(43, 13)| \\ |F_1^1(0, 0)| \\ \vdots \\ |F_4^1(21, 6)| \\ |F_1^2(0, 0)| \\ |F_1^2(1, 0)| \\ \vdots \\ |F_{16}^2(10, 2)| \end{bmatrix} \qquad (8)$$
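The block division and amplitude scanning described above can be sketched as follows. This is a minimal sketch with hypothetical function names. It assumes that "round" in the block-index formulas denotes truncation to an integer (floor), since that is the reading under which the four and sixteen blocks tile the 44x56 area, and it assumes the same within-row scan order as for the first feature vector (area A, then area B, for each v).

```python
import numpy as np

# Table 1 scan ranges per level j: (u range of A, u range of B, v range), inclusive.
SCAN = {
    0: ((0, 10), (33, 43), (0, 13)),
    1: ((0, 5), (17, 21), (0, 6)),
    2: ((0, 2), (9, 10), (0, 2)),
}

def split_blocks(area, n):
    """Split `area` into n*n equal blocks, ordered k = 1, 2, ... row by row."""
    bh, bw = area.shape[0] // n, area.shape[1] // n
    blocks = []
    for k in range(1, n * n + 1):
        s = (k - 1) % n        # horizontal block index
        t = (k - 1) // n       # vertical block index (floor assumed for "round")
        blocks.append(area[t * bh:(t + 1) * bh, s * bw:(s + 1) * bw])
    return blocks

def second_feature_vector(face):
    """Build the 856-dimensional amplitude vector from a 56x46 array face[y, x]."""
    entire = face[:, 1:45]     # f_1^0(x, y) = f(x + 1, y), a 44x56 area
    parts = []
    for level, n in ((0, 1), (1, 2), (2, 4)):
        (a0, a1), (b0, b1), (v0, v1) = SCAN[level]
        u_idx = np.r_[a0:a1 + 1, b0:b1 + 1]
        for block in split_blocks(entire, n):
            amplitude = np.abs(np.fft.fft2(block))   # |F_k^j(u, v)|
            parts.append(amplitude[v0:v1 + 1, :][:, u_idx].ravel())
    return np.concatenate(parts)
```

The three levels contribute 308, 4 x 77, and 16 x 15 components, which sums to the 856 dimensions stated for the second feature vector.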



The second PCLDA projector 226 projects the second feature vector $x_2^f$,
which is extracted by the second Fourier transformer 225, onto a discriminant space
obtained by carrying out PCLDA on the second feature vector $x_2^f$. The discriminant
space is defined by a second basis matrix $\Psi_2^f$, which is obtained using a
predetermined well-known algorithm and is also well known to those skilled in the
art.

The first vector normalizer 227 generates a first normalized vector $y_1^f$ by
normalizing the first feature vector, projected onto the discriminant space by the first
PCLDA projector 223. In addition, the first vector normalizer 227 generates a
second normalized vector $y_2^f$ by normalizing the second feature vector, projected
onto the discriminant space by the second PCLDA projector 226. The first and
second normalized vectors $y_1^f$ and $y_2^f$ can be expressed by Equations (9) and (10),
respectively.

$$y_1^f = \frac{\Psi_1^{f\,T} x_1^f - m_1^f}{\left|\Psi_1^{f\,T} x_1^f - m_1^f\right|} \qquad (9)$$

$$y_2^f = \frac{\Psi_2^{f\,T} x_2^f - m_2^f}{\left|\Psi_2^{f\,T} x_2^f - m_2^f\right|} \qquad (10)$$

In Equations (9) and (10), $\Psi_1^f$ and $\Psi_2^f$ are the first and second basis
matrices, and $m_1^f$ and $m_2^f$ represent averages of vectors
projected onto the discriminant space by the first and second PCLDA projectors 223
and 226. The first and second normalized vectors $y_1^f$ and $y_2^f$ have dimensions of
70 and 80, respectively.
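The projection and normalization step can be illustrated with the sketch below. The function name is hypothetical, and the basis matrix and mean vector are random placeholders, since a real basis matrix would come from PCLDA training that is not described here. The sketch only shows the arithmetic of projecting, subtracting the mean of the projected vectors, and scaling to unit length.

```python
import numpy as np

def project_and_normalize(basis, mean, x):
    """Project x with basis^T, subtract the mean projection, scale to unit length."""
    centered = basis.T @ x - mean
    norm = np.linalg.norm(centered)
    if norm == 0.0:
        raise ValueError("projected vector coincides with the mean; cannot normalize")
    return centered / norm

# Placeholder data only: shapes follow the 644 -> 70 dimensions given in the text.
rng = np.random.default_rng(0)
basis_1f = rng.standard_normal((644, 70))   # hypothetical stand-in for the first basis matrix
mean_1f = rng.standard_normal(70)           # hypothetical mean of projected vectors
x_1f = rng.standard_normal(644)             # stand-in for the first feature vector
y_1f = project_and_normalize(basis_1f, mean_1f, x_1f)
```

The same helper would apply to the second feature vector with an 856 x 80 basis matrix.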

The central Fourier feature vector generator 230 obtains a Fourier spectrum 
and Fourier amplitude of the central face area 231 and generates a central Fourier 
feature vector using the Fourier spectrum and Fourier amplitude of the central face 
area 231. The central Fourier feature vector generator 230 operates in a similar
way to the way the entire Fourier feature vector generator 220 operates. The 
operation of the central Fourier feature vector generator 230 will become more 
apparent in the following paragraphs. 

The central face area 231 is obtained by clipping the normalized face image
110, which is represented by $f(x, y)$, into a 32x32 image starting from a point (7, 12)
and ending with a point (38, 43).

The third Fourier transformer 232 obtains a Fourier spectrum $G(u, v)$ by
converting the central face area 231, which is represented by $g(x, y)$, into a frequency
domain through Fourier transformation. A third feature vector $x_1^g$ is obtained by
performing raster scanning on the Fourier spectrum $G(u, v)$. Here, the raster
scanning is performed on the scanning areas A and B, as defined in Table 1. The
third feature vector $x_1^g$, which is obtained as a result of the raster scanning, has a
dimension of 256.
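The clipping of the central face area and the resulting 256-dimensional vector can be checked with a short sketch. The function name is hypothetical, and the scan order (u varying fastest, area A then area B within each row v) is again an assumption.

```python
import numpy as np

def third_feature_vector(face):
    """Crop the 32x32 central area from face[y, x] and raster-scan its spectrum."""
    central = face[12:44, 7:39]              # x = 7..38, y = 12..43
    spectrum = np.fft.fft2(central)          # spectrum[v, u] corresponds to G(u, v)
    u_idx = np.r_[0:8, 24:32]                # Table 1: A = (0,0)-(7,7), B = (24,0)-(31,7)
    kept = spectrum[0:8, :][:, u_idx].ravel()
    return np.concatenate([kept.real, kept.imag])
```

Each scan area covers 8 x 8 components, so the real and imaginary parts contribute 128 values each.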

The third PCLDA projector 233 projects the third feature vector $x_1^g$, which is
extracted by the third Fourier transformer 232, onto a discriminant space obtained by
carrying out PCLDA on the third feature vector $x_1^g$. The discriminant space is
defined by a third basis matrix $\Psi_1^g$, which is obtained using a predetermined
well-known algorithm and is also well known to those skilled in the art.

The fourth Fourier transformer 235 converts the block-divided face area 234
into a frequency domain by performing Fourier transformation on the block-divided
face area 234. The block-divided face area 234 includes three sub-spaces, i.e., a
central area 321, which is represented by $g_1^0(x, y)$ and has a 32x32
size, a four-block area 322, which is represented by $g_k^1(x, y)$ and is constituted by
four identical blocks having a 16x16 size, and a sixteen-block area 323, which is
represented by $g_k^2(x, y)$ and is constituted by sixteen identical blocks having an 8x8
size, as shown in FIG. 3B. The Fourier transformation is performed on each of the
sub-spaces of the block-divided face area 234. The central area 321 has a Fourier
amplitude $|G(u, v)|$, and each of the sub-blocks $g_k^j(x, y)$ of the four-block area 322
and the sixteen-block area 323 has an amplitude $|G_k^j(u, v)|$. By performing raster
scanning on these Fourier amplitudes, a fourth feature vector $x_2^g$ is obtained. The
raster scanning is performed on the scanning areas A and B, as defined in Table 1
above. Here, the fourth feature vector $x_2^g$, which is obtained as a result of the raster
scanning, has a dimension of 384.

The fourth PCLDA projector 236 projects the fourth feature vector $x_2^g$, which
is extracted by the fourth Fourier transformer 235, onto a discriminant space
obtained by carrying out PCLDA on the fourth feature vector $x_2^g$. The discriminant
space is defined by a fourth basis matrix $\Psi_2^g$, which is obtained using a
predetermined well-known algorithm and is also well known to those skilled in the
art.

In order to generate a unit vector using an average $m_1^g$ of vectors projected
onto the discriminant space by the third PCLDA projector 233, the second vector
normalizer 237 normalizes the third feature vector. As a result of the normalization,
a third normalized vector $y_1^g$ is obtained. In addition, in order to generate a unit
vector using an average $m_2^g$ of vectors projected onto the discriminant space by the
fourth PCLDA projector 236, the second vector normalizer 237 normalizes the fourth
feature vector. As a result of the normalization, a fourth normalized vector $y_2^g$ is
obtained. The third and fourth normalized vectors $y_1^g$ and $y_2^g$ have dimensions of
70 and 80, respectively.

FIG. 4 is a detailed block diagram of the entire Fourier feature vector
generation unit 123 of FIG. 1. As shown in FIG. 4, the entire Fourier feature vector
generation unit 123 includes a first coupler 410, a first LDA projector 420, and a first
quantizer 430.

Referring to FIG. 4, the first coupler 410 couples the first and second
normalized vectors $y_1^f$ and $y_2^f$, provided by the first vector normalizer 227 in the
Fourier feature generation unit 121, into a single coupled vector having a dimension
of, for example, 150.

The first LDA projector 420 projects the coupled vector, provided by the first
coupler 410, onto a linear discriminant space defined by a fifth basis matrix $\Psi_3^f$,
which is obtained using a predetermined well-known algorithm and is also well
known to those skilled in the art. A resulting projected vector $z^f$ can be expressed by
Equation (11) below.

$$z^f = \Psi_3^{f\,T} \begin{bmatrix} y_1^f \\ y_2^f \end{bmatrix} \qquad (11)$$

The first quantizer 430 clips and quantizes each component of the projected
vector, provided by the first LDA projector 420, into an unsigned 5-bit integer, using
Equation (12) below, and then stores a resulting vector as an entire Fourier feature
vector $w^f$.

$$w_i^f = \begin{cases} 0 & \text{if } z_i^f < -16 \\ 31 & \text{if } z_i^f > 15 \\ \mathrm{floor}(z_i^f + 16) & \text{otherwise} \end{cases} \qquad (12)$$

FIG. 5 is a detailed block diagram of the central Fourier feature vector
generation unit 125 of FIG. 1. As shown in FIG. 5, the central Fourier feature vector
generation unit 125 includes a second coupler 510, a second LDA projector 520, and
a second quantizer 530.

Referring to FIG. 5, the second coupler 510 couples the third and fourth
normalized vectors $y_1^g$ and $y_2^g$, provided by the second vector normalizer 237 in the
Fourier feature generation unit 121, into a single coupled vector having a dimension
of, for example, 150.

The second LDA projector 520 projects the coupled vector, provided by the
second coupler 510, onto a linear discriminant space defined by a sixth basis matrix
$\Psi_3^g$, which is obtained using a predetermined well-known algorithm and is also well
known to those skilled in the art. A resulting projected vector $z^g$ can be expressed
in the same form as Equation (11).
The second quantizer 530 clips and quantizes each component of the
projected vector $z^g$, provided by the second LDA projector 520, into an unsigned
5-bit integer, using Equation (13) below, and then stores a resulting vector as a
central Fourier feature vector $w^g$.

$$w_i^g = \begin{cases} 0 & \text{if } z_i^g < -16 \\ 31 & \text{if } z_i^g > 15 \\ \mathrm{floor}(z_i^g + 16) & \text{otherwise} \end{cases} \qquad (13)$$

FIG. 6 is a detailed block diagram of the intensity feature generation unit 131
of FIG. 1. As shown in FIG. 6, the intensity feature generation unit 131 includes a
pose estimation and compensation unit 610, a second image divider 620, an entire
intensity feature vector generator 630, and a local intensity feature vector generator
640.

Referring to FIG. 6, the pose estimation and compensation unit 610 estimates
a pose of the normalized face image 110, compensates for the normalized face
image 110 based on the result of the pose estimation, and outputs a resulting frontal
face image. The pose estimation and compensation unit 610 fixes mismatches
caused by variation in the pose of the normalized face image 110.

The second image divider 620 divides the pose-compensated face image
output from the pose estimation and compensation unit 610 into an entire face area
and first through fifth local images having face components 1 through 5, respectively.
The entire face area has a predetermined raster scanning area, which is defined in
Table 2 below. More specifically, the entire face area, starting from a point (0, 0),
has a 46x56 size. The first through fifth local images start from points (9, 4), (6, 16),
(17, 16), (7, 25), and (16, 25), respectively, and have 29x27, 24x21, 24x21, 24x24,
and 24x24 sizes, respectively.

For example, the following table defines raster scanning areas and vector
dimensions of face component-wise areas.

Table 2

                           Top Left      Size               Vector
                           x     y       Width   Height     Dimension
Entire Face: x^h           0     0       46      56         2576
Face Component 1: x_1^c    9     4       29      27         783
Face Component 2: x_2^c    6     16      24      21         504
Face Component 3: x_3^c    17    16      24      21         504
Face Component 4: x_4^c    7     25      24      24         576
Face Component 5: x_5^c    16    25      24      24         576



In the entire intensity feature vector generator 630, a first raster scanner 631 
generates a fifth feature vector , which is comprised of intensity values for the 
entire face area, by performing raster scanning on the entire face area along a 
5 column direction. The column-directional raster scanning is carried out starting with 
the top-left point (0, 0) of the entire face area and ending with the bottom right point 
(46, 56) of the entire face area. Here, the fifth feature vector has a dimension of, 
for example, 2576. 

A fifth PCLDA projector 632 projects the fifth feature vector $x^h$, provided by
the first raster scanner 631, onto a discriminant space defined by a seventh basis
matrix $\Psi_1^h$, which is obtained using a predetermined algorithm that is well
known to those skilled in the art.

A fifth vector normalizer 633 normalizes the projected vector into a unit vector
$y^h$ and stores the unit vector as an entire intensity feature vector. The unit
vector $y^h$ can be expressed by Equation (14) below.

$$y^h = \frac{\Psi_1^{hT} x^h - m^h}{\left| \Psi_1^{hT} x^h - m^h \right|} \tag{14}$$

In Equation (14), $m^h$ represents an average of vectors, projected on the
discriminant space by the fifth PCLDA projector 632, and has a dimension of 40.

In the local intensity feature vector generator 640, second through sixth raster
scanners 641a through 645a each generate a sixth feature vector $x_k^c$ ($k = 1, 2, \ldots, 5$),
which is comprised of intensity values for each of the local face areas, by performing
raster scanning on each of the local face areas in the column direction. The
column-directional raster scanning is carried out starting with the top-left point of
each of the local face areas and ending with the bottom-right point of each of the
local face areas. The sixth feature vector $x_k^c$ ($k = 1, 2, \ldots, 5$) has a dimension of
783, 504, 504, 576, or 576 for each of the raster scanning areas defined in Table 2.

Sixth through tenth PCLDA projectors 641b through 645b project the sixth
feature vector $x_k^c$ ($k = 1, 2, \ldots, 5$), provided by each of the second through sixth
raster scanners 641a through 645a, onto a discriminant space defined by an eighth
basis matrix $\Psi_k^c$ ($k = 1, 2, \ldots, 5$), which is obtained using a predetermined
algorithm that is well known to those skilled in the art.

Sixth through tenth vector normalizers 641c through 645c normalize the
vector, projected onto the discriminant space by each of the sixth through tenth
PCLDA projectors 641b through 645b, into a unit vector $y_k^c$ ($k = 1, 2, \ldots, 5$) and store
the unit vector $y_k^c$ as a local intensity feature vector. The unit vector $y_k^c$ can be
defined by Equation (15) below.

$$y_k^c = \frac{\Psi_k^{cT} x_k^c - m_k^c}{\left| \Psi_k^{cT} x_k^c - m_k^c \right|} \tag{15}$$

In Equation (15), $m_k^c$ represents an average of vectors, projected on the
discriminant space by each of the sixth through tenth PCLDA projectors 641b
through 645b, and has a dimension of 40.
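
The projection and normalization steps of the intensity feature vector generators can be illustrated as follows. This is a minimal sketch rather than the patented implementation: it assumes the basis matrix is stored with one column per output dimension, so that projection is a transposed matrix product, and that the average vector is subtracted before scaling to unit length. The function name `project_and_normalize` is hypothetical, and the basis and average used here are random placeholders, not trained PCLDA data.

```python
import numpy as np


def project_and_normalize(x, basis, mean):
    """Project x with basis (input_dim x 40), subtract the mean, scale to unit length."""
    projected = basis.T @ x - mean
    norm = np.linalg.norm(projected)
    if norm == 0.0:
        return projected               # degenerate case: leave the zero vector unchanged
    return projected / norm


rng = np.random.default_rng(0)
x_h = rng.random(2576)                     # stand-in for a raster-scanned entire face area
basis = rng.standard_normal((2576, 40))    # random placeholder, NOT a trained basis matrix
mean = rng.standard_normal(40)             # placeholder average of projected vectors
y_h = project_and_normalize(x_h, basis, mean)
```

The same helper would be applied once per face component, each with its own basis matrix and average vector, to obtain the five local intensity feature vectors.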

FIG. 7 is a detailed block diagram of the pose estimation and compensation
unit 610 of FIG. 6. As shown in FIG. 7, the pose estimation and compensation unit
610 includes a pose estimator 710, which is comprised of n
principal-component-analysis/distance-from-feature-space (PCA/DFFS) blocks 711,
712, and 713 and a minimum detector 714, and a pose compensator 720, which is
comprised of an affine transformer 721 and an inverse mapper 722.

The pose estimator 710 estimates the pose of the normalized face image 110
to belong to one of the nine pose classes defined in Table 3 below, using a
PCA/DFFS method. For this, first through n-th PCA projectors 711a, 712a, and 713a
(here, n = 9) project the normalized face image 110 onto a PCA sub-space of a
projection matrix $P_i$ ($i = 1, 2, \ldots, 9$) for each of the nine pose classes. A PCA model
for each of the nine pose classes can be learned or obtained from exemplary images,
collected from training face images. Face images having poses similar to those
defined by the nine pose classes are determined based on the distance for each pose
class.

Table 3

Pose ID    Definition
1          Upward
2          Slightly Upward
3          Leftward
4          Slightly Leftward
5          Front
6          Slightly Rightward
7          Rightward
8          Slightly Downward
9          Downward

First through n-th DFFS calculators 711b, 712b, and 713b (here, n = 9)
calculate a distance $d_i(x)$ for each of the nine pose classes using Equation (16) below.
The distance $d_i(x)$ represents how precisely a face can be represented by the PCA
sub-space for a specific pose class.

$$d_i(x) = \|x\|^2 - \|P_i (x - M_i)\|^2 \tag{16}$$

In Equation (16), $x$ and $M_i$ represent a vector obtained by column-wise raster
scanning of the normalized face image and an average vector of the PCA
sub-space for a specific pose class ($i$), respectively. The projection matrix $P_i$
($i = 1, 2, \ldots, 9$) is obtained using a predetermined algorithm that is well
known to those skilled in the art.

The minimum detector 714 detects a minimum among the distances $d_i(x)$
provided by the first through n-th DFFS calculators 711b, 712b, and 713b (here, n =
9), and estimates a predetermined pose class corresponding to the detected
minimum as a pose class for the normalized face image 110, which is expressed by
Equation (17) below.

$$i_{\min} = \arg\min_i \{ d_i(x) \}, \quad i = 1, 2, \ldots, 9 \tag{17}$$

The above-described PCA/DFFS method is taught by B. Moghaddam and A.
Pentland in "Face Recognition using View-based and Modular Eigenspaces,"
Automatic Systems for the Identification and Inspection of Humans, SPIE Vol. 2277,
July 1994.
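
The distance calculation and minimum detection can be sketched as below. The sketch assumes that Equation (16) has the form $d_i(x) = \|x\|^2 - \|P_i (x - M_i)\|^2$ and that each projection matrix has orthonormal rows. The function names `dffs` and `estimate_pose`, the toy dimensions, and the randomly generated sub-spaces are all hypothetical placeholders, not data from the described experiment.

```python
import numpy as np


def dffs(x, P, M):
    """Distance of Equation (16): ||x||^2 - ||P (x - M)||^2."""
    coeffs = P @ (x - M)
    return float(x @ x - coeffs @ coeffs)


def estimate_pose(x, models):
    """Return the 1-based pose ID with the smallest distance, plus all distances."""
    distances = [dffs(x, P, M) for P, M in models]
    return int(np.argmin(distances)) + 1, distances


# Toy setup: nine random 3-dimensional sub-spaces of a 20-dimensional space.
rng = np.random.default_rng(0)
dim, n_components = 20, 3
models = []
for _pose in range(9):
    q, _r = np.linalg.qr(rng.standard_normal((dim, n_components)))
    models.append((q.T, np.zeros(dim)))    # orthonormal rows, zero average (placeholders)

# A vector lying exactly in the sub-space of pose ID 5 (index 4).
x = models[4][0].T @ np.array([1.0, -2.0, 0.5])
pose_id, distances = estimate_pose(x, models)
```

Because the test vector lies inside the fifth sub-space, its distance for that class is zero up to rounding, and the minimum detector selects pose ID 5 (Front in Table 3).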

The pose compensator 720 converts the normalized face image 110
into a frontal face image according to the pose class estimated by the pose estimator
710. To do so, the affine transformer 721 loads a predetermined affine
transformation matrix that brings about a frontal pose class corresponding to the
estimated pose class. Affine transformation from a pose class to a frontal pose
class depends on correspondence points between the pose class and the frontal
pose class. Each pose class may have, as correspondence points, average locations
of, for example, 15 distinctive face features: the left and right edges of the two
eyebrows and the two eyes, the left, right, and bottom edges of the nose, and the left,
right, top, and bottom edges of the mouth. Such distinctive face features can be
manually selected from a training image assigned to each pose class.

The inverse mapper 722 provides a pose-compensated face image 730 by
geometrically inverse-mapping the normalized face image 110 into a frontal face
image using the predetermined affine transformation matrix loaded into the affine
transformer 721. Affine transformation from pose class (i) to the frontal face image
is represented by a six-dimensional parameter set {a, b, c, d, e, f}, which is defined in
Table 4 below, and these parameters are calculated by the ratio of distinctive face
features in the frontal face image to those in each pose face image.



Table 4

Pose
Class       a           b           c           d           e           f
1        0.991580    0.010128    0.074633   -0.003959    0.943660    1.515700
2        0.970768   -0.002972    0.881806    0.003942    0.887513    2.570872
3        1.037537    0.119828   -4.445846   -0.068323    1.242277   -5.354795
4        0.996640   -0.073498    1.742925    0.004347    1.033041   -0.724001
5        1.000000    0.000000    0.000000    0.000000    1.000000    0.000000
6        0.987388    0.086766   -1.807056   -0.003484    0.998318    0.264492
7        0.999838   -0.128101    3.728913    0.013586    1.185747   -4.659860
8        0.984965   -0.000953    0.423803    0.002269    0.970589    0.884239
9        0.978864   -0.003004    0.538113    0.011342    1.001916   -0.477181






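Intensity at a point (x, y) on the inverse-mapped face image can be calculated
using bi-linear interpolation Formula (18) as described below.

$$(1 - dy) \cdot \{ (1 - dx) \cdot f(x', y') + dx \cdot f(x' + 1, y') \} + dy \cdot \{ (1 - dx) \cdot f(x', y' + 1) + dx \cdot f(x' + 1, y' + 1) \} \tag{18}$$

In Formula (18), $x' = \mathrm{ceil}(a \cdot x + b \cdot y + c)$, $y' = \mathrm{ceil}(d \cdot x + e \cdot y + f)$,
$dx = (a \cdot x + b \cdot y + c) - x'$, and $dy = (d \cdot x + e \cdot y + f) - y'$. Here, $dx$ and $dy$ denote
offsets, not products, and $f(x', y')$, which is distinct from the parameter f,
represents the intensity of the normalized face image 110 at a point $(x', y')$.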
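
Inverse mapping with bi-linear interpolation can be sketched as follows. Several choices here are assumptions rather than statements of the described method: the sketch uses `floor` instead of the `ceil` given in the text, so that the offsets fall in [0, 1) as in the conventional form of bi-linear interpolation, it clamps neighbours to the image border, and the function name `inverse_map` is hypothetical. The parameter order (a, b, c, d, e, f) follows Table 4.

```python
import numpy as np


def inverse_map(image, params):
    """Warp `image` by inverse mapping with bi-linear interpolation."""
    a, b, c, d, e, f = params
    height, width = image.shape
    out = np.zeros((height, width), dtype=np.float64)
    for y in range(height):
        for x in range(width):
            sx = a * x + b * y + c           # source column in the input image
            sy = d * x + e * y + f           # source row in the input image
            x0, y0 = int(np.floor(sx)), int(np.floor(sy))
            dx, dy = sx - x0, sy - y0        # fractional offsets in [0, 1)
            # Clamp the four neighbours to the image border (an assumption).
            x0c, x1c = np.clip([x0, x0 + 1], 0, width - 1)
            y0c, y1c = np.clip([y0, y0 + 1], 0, height - 1)
            top = (1 - dx) * image[y0c, x0c] + dx * image[y0c, x1c]
            bottom = (1 - dx) * image[y1c, x0c] + dx * image[y1c, x1c]
            out[y, x] = (1 - dy) * top + dy * bottom
    return out


face = np.arange(56 * 46, dtype=np.float64).reshape(56, 46)    # placeholder image
frontal = inverse_map(face, (1.0, 0.0, 0.0, 0.0, 1.0, 0.0))     # pose class 5 row of Table 4
shifted = inverse_map(face, (1.0, 0.0, 1.0, 0.0, 1.0, 0.0))     # hypothetical one-pixel shift
```

With the identity parameters of pose class 5 the output equals the input, which is a quick sanity check for any implementation of this step.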

FIG. 8 is a detailed block diagram of the entire composite feature vector
generation unit 133 of FIG. 1. As shown in FIG. 8, the entire composite feature
vector generation unit 133 includes a third coupler 810, a third LDA projector 820,
and a third quantizer 830.

Referring to FIG. 8, the third coupler 810 couples the first and second
normalized vectors $y_1^f$ and $y_2^f$, provided by the first vector normalizer 227 in the
Fourier feature generation unit 121, and the entire intensity feature vector $y^h$,
provided by the entire intensity feature vector generator 630, into a single coupled
vector having a dimension of, for example, 190.

The third LDA projector 820 projects the coupled vector, provided by the third
coupler 810, onto a linear discriminant space defined by a ninth basis matrix $\Psi_2^h$,
which is obtained using a predetermined algorithm that is well
known to those skilled in the art. A resulting projected vector $z^h$ can be expressed
by Equation (19) below.

$$z^h = \Psi_2^{hT} \begin{bmatrix} y_1^f \\ y_2^f \\ y^h \end{bmatrix} \tag{19}$$

The third quantizer 830 clips and quantizes each component of the projected
vector $z^h$, provided by the third LDA projector 820, into an unsigned 5-bit integer,
using Equation (20) below, and then stores a resulting vector as an entire composite
feature vector $w^h$.

$$w_i^h = \begin{cases} 0 & \text{if } z_i^h < -16 \\ 31 & \text{if } z_i^h > 15 \\ \mathrm{floor}(z_i^h + 16) & \text{otherwise} \end{cases} \tag{20}$$
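
The coupling, projection, and quantization steps can be sketched as follows. The clipping rule follows Equation (20); everything else is an assumption made for illustration. In particular, the basis matrix is a random placeholder, the 75/75 split of the 150 dimensions contributed by the two normalized Fourier vectors is arbitrary, the output dimension of 64 is chosen only as an example, and the names `quantize_5bit` and `composite_feature` are hypothetical.

```python
import numpy as np


def quantize_5bit(z):
    """Clip each component to [-16, 15] and shift it into an unsigned 5-bit integer."""
    z = np.asarray(z, dtype=np.float64)
    w = np.floor(z + 16.0)
    w = np.where(z < -16.0, 0.0, w)    # values below the range map to 0
    w = np.where(z > 15.0, 31.0, w)    # values above the range map to 31
    return w.astype(np.uint8)


def composite_feature(vectors, basis):
    """Couple the input vectors, project them with basis.T, and quantize."""
    coupled = np.concatenate(vectors)
    return quantize_5bit(basis.T @ coupled)


rng = np.random.default_rng(0)
y1 = rng.standard_normal(75)               # placeholder normalized Fourier vector
y2 = rng.standard_normal(75)               # placeholder normalized Fourier vector
y_h = rng.standard_normal(40)              # placeholder entire intensity feature vector
basis = rng.standard_normal((190, 64))     # random placeholder, NOT a trained LDA basis
w_h = composite_feature([y1, y2, y_h], basis)
samples = quantize_5bit([-20.0, -16.0, -0.5, 0.0, 14.9, 15.0, 40.0]).tolist()
```

The sample inputs cover both clipping branches and the shifted interior range, so `samples` shows how values outside [-16, 15] saturate at 0 and 31.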



FIG. 9 is a detailed block diagram of the central composite feature vector
generation unit 135 of FIG. 1. As shown in FIG. 9, the central composite feature
vector generation unit 135 includes a fourth coupler 910, a fourth LDA projector 920,
and a fourth quantizer 930.

Referring to FIG. 9, the fourth coupler 910 couples the third and fourth
normalized vectors $y_1^g$ and $y_2^g$, provided by the second vector normalizer 237 in the
Fourier feature generation unit 121, and the local intensity feature vectors $y_k^c$,
provided by the local intensity feature vector generator 640, into a single coupled
vector that has, for example, a dimension of 350.

The fourth LDA projector 920 projects the coupled vector, provided by the
fourth coupler 910, onto a linear discriminant space defined by a tenth basis matrix
$\Psi_6^c$, which is obtained using a predetermined algorithm that is well
known to those skilled in the art. A resulting projected vector $z^c$ can be expressed by
Equation (21) below.

$$z^c = \Psi_6^{cT} \begin{bmatrix} y_1^g \\ y_2^g \\ y_1^c \\ \vdots \\ y_5^c \end{bmatrix} \tag{21}$$

The fourth quantizer 930 clips and quantizes each component of the projected
vector, provided by the fourth LDA projector 920, into an unsigned 5-bit integer,
using Equation (22) below, and then stores a resulting vector as a central composite
feature vector $w^c$.

$$w_i^c = \begin{cases} 0 & \text{if } z_i^c < -16 \\ 31 & \text{if } z_i^c > 15 \\ \mathrm{floor}(z_i^c + 16) & \text{otherwise} \end{cases} \tag{22}$$

The first through tenth basis matrices that have been mentioned above (or the
projection matrix) are taught by Tae-kyun Kim in "Component-based LDA Face
Descriptor for Image Retrieval," British Machine Vision Conference 2002, September
2002.

Hereinafter, the face image-retrieval performance of the method of extracting
feature vectors according to the present invention will be described in greater detail.

For an experiment on the method of extracting feature vectors according to
the present invention, a set of MPEG-7 face image data, which is comprised of five
databases, is used. These five databases are an MPEG-7 face image database
(E1) of expansion version 1, an Altkom database (A2), an MPEG-7 test set (M3) in an
XM2VTS database, a FERET database (F4), and an MPEG-7 test set (B5) in a Banca
database. The total number of images that are used in this experiment is 11,845.
Among the 11,845 images, 3,655 images are used only as training images for LDA
projection, and the rest are used as test images for evaluating the performance of an
image searching algorithm according to the present invention. Of those test images,
4,190 images are used as basis images for extracting face feature vectors and the
others are used as images for face searching. Table 5 shows detailed information
on training images and test images used in this experiment. General information on
each image used in this experiment is given in advance so that the performance of
the image retrieval algorithm according to the present invention can be evaluated.



Table 5

Set                          DB         Persons   Images per person   Total images
Training Image 1 (50vs50)    Altkom        40            15                 600
                             Banca          -             -                   -
                             MPEG         317             5               1,585
                             XM2VTS       147            10               1,470
                             FERET          -             -                   -
                             Total        504             -               3,655
Test Image 1 (50vs50)        Altkom        40            15                 600
                             Banca         52            10                 520
                             MPEG         318             5               1,590
                             XM2VTS       148            10               1,480
                             FERET          -             -               4,000
                             Total        558             -               8,190



The precision of retrieval, carried out by the image retrieval algorithm
according to the present invention, is measured based on average normalized
modified retrieval rate (ANMRR), which is taught by B. S. Manjunath, Philippe
Salembier, and Thomas Sikora in "Introduction to MPEG-7: Multimedia Content
Description Interface," John Wiley & Sons Ltd., 2002.
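
For orientation, ANMRR can be computed roughly as sketched below. The sketch follows the definition as it is commonly stated for MPEG-7 evaluation (a rank window of K = min(4 NG, 2 GTM) and a penalty rank of 1.25 K for ground-truth images not retrieved within the window); these details are assumptions here and should be checked against the cited reference. The function names `nmrr` and `anmrr` and the example queries are hypothetical.

```python
def nmrr(ranks_of_relevant, num_relevant, gtm):
    """Normalized modified retrieval rank for one query.

    ranks_of_relevant: 1-based ranks at which ground-truth images were retrieved.
    num_relevant: number of ground-truth images NG for the query.
    gtm: largest NG over all queries.
    """
    k = min(4 * num_relevant, 2 * gtm)
    # Ranks beyond the window K are replaced by the penalty rank 1.25 * K.
    penalised = [r if r <= k else 1.25 * k for r in ranks_of_relevant]
    # Ground-truth images that were never retrieved also receive the penalty.
    penalised += [1.25 * k] * (num_relevant - len(penalised))
    avr = sum(penalised) / num_relevant
    mrr = avr - 0.5 - num_relevant / 2.0
    return mrr / (1.25 * k - 0.5 * (1 + num_relevant))


def anmrr(queries):
    """Average NMRR over (ranks_of_relevant, num_relevant) pairs."""
    gtm = max(ng for _ranks, ng in queries)
    return sum(nmrr(ranks, ng, gtm) for ranks, ng in queries) / len(queries)


perfect = nmrr([1, 2, 3], 3, 3)    # all ground-truth images ranked first
missed = nmrr([], 3, 3)            # no ground-truth image retrieved
average = anmrr([([1, 2, 3], 3), ([], 3)])
```

Under this definition a value of 0 corresponds to perfect retrieval and a value of 1 to complete failure, so lower ANMRR values indicate better retrieval.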

According to the experiment results, the retrieval precision (ANMRR, for which
lower values indicate better retrieval) of the image
retrieval algorithm according to the present invention is 0.354 when using a Fourier
feature vector only and is 0.390 when using an intensity feature vector only. When
using both the Fourier feature vector and the intensity feature vector, the retrieval
precision of the image retrieval algorithm according to the present invention is 0.266.
The sizes of a conventional Fourier feature vector and the intensity feature vector
are 240 (= 48 x 5) bits and 200 (= 40 x 5) bits, respectively, while the size of a Fourier
feature vector according to the present invention is 320 (= 64 x 5) bits. The retrieval
precision of the image retrieval algorithm according to the present invention varies
depending on the size of the feature vector. For example, when the size is 48 (i.e., 240
bits), ANMRR is 0.280. When the size is 64 (i.e., 320 bits), ANMRR is 0.266.
When the size is 128 (i.e., 640 bits), ANMRR is 0.249. This means, as described
before, that using a larger feature vector increases retrieval precision while
slightly increasing the computational load. Moreover, all of these results show
that it is possible to provide a precise retrieval technique and bring about excellent
identification results by using both the Fourier feature vector and the intensity feature
vector, as in the present invention.
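
The bit counts quoted above follow from storing each quantized component in 5 bits. The following sketch packs such components into bytes to make the arithmetic concrete. The storage layout (most significant bit first, zero padding at the end) and the function name `pack_5bit` are assumptions made for illustration; the text does not specify a layout.

```python
def pack_5bit(values):
    """Pack integers in [0, 31] into bytes, 5 bits per value, most significant bit first."""
    bits = 0
    for v in values:
        if not 0 <= v <= 31:
            raise ValueError("each component must fit in 5 bits")
        bits = (bits << 5) | v
    n_bits = 5 * len(values)
    n_bytes = (n_bits + 7) // 8
    # Left-align the bit string so that any padding falls at the end.
    return (bits << (n_bytes * 8 - n_bits)).to_bytes(n_bytes, "big"), n_bits


packed_64, bits_64 = pack_5bit([31] * 64)     # 64 components
packed_48, bits_48 = pack_5bit([0] * 48)      # 48 components
packed_128, bits_128 = pack_5bit([7] * 128)   # 128 components
```

Here 48, 64, and 128 components occupy 240, 320, and 640 bits, that is, 30, 40, and 80 bytes, matching the sizes given in the text.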

The present invention can be realized as computer-readable codes that can
be written on a computer-readable recording medium. The computer-readable
medium includes nearly all sorts of recording devices on which computer-readable
data can be written. For example, the computer-readable recording medium
includes ROM, RAM, CD-ROM, a magnetic tape, a floppy disk, an optical data
storage, and carrier waves, such as data transmission through the Internet. The
computer-readable recording medium can also be distributed over a plurality of
computer systems that are connected to a network. Therefore, the
computer-readable codes can be recorded on the computer-readable recording
medium and can be executed in a decentralized manner.

As described above, according to the present invention, a frequency feature
vector or composite feature vector can be selectively used depending on the
specification of a face recognition and retrieval system. In addition, according to the
present invention, it is possible to overcome the limits of conventional retrieval
techniques using a frequency domain feature vector only or using a spatial domain
feature vector only and to considerably enhance image-retrieval precision by
separately generating a Fourier feature vector and an intensity feature vector for a
predetermined normalized face image and then merging these two feature vectors
into a single composite feature vector.

While the present invention has been particularly shown and described with
reference to exemplary embodiments thereof, it will be understood by those of
ordinary skill in the art that various changes in form and details may be made therein
without departing from the spirit and scope of the present invention as defined by the
following claims.